-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(consensus): node stalled after client has stopped #1001
Conversation
WalkthroughThe changes in this pull request introduce a new error variable, Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
abci/client/socket_client.go (1)
41-44
: Document theErrClientStopped
error variable for clarity.The addition of the exported variable
ErrClientStopped
improves error handling consistency. To enhance code readability and maintainability, consider adding a documentation comment explaining the purpose and usage ofErrClientStopped
.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
abci/client/socket_client.go
(2 hunks)internal/consensus/state_try_add_commit.go
(2 hunks)
🔇 Additional comments (1)
abci/client/socket_client.go (1)
241-241
: Verify consistency of error handling with ErrClientStopped
.
By returning ErrClientStopped
instead of a new error instance, ensure that all error comparisons in the codebase use errors.Is
or errors.As
to handle this error. Direct comparisons using ==
with error variables may fail due to different error instances.
To confirm consistent error handling, run the following script:
✅ Verification successful
Error handling for ErrClientStopped
is consistent with Go best practices
The error is correctly defined as a package-level variable and returned directly without wrapping. Since there are no direct equality comparisons (==
) found in the codebase and the error is returned as-is, the current implementation follows Go's error handling best practices. No changes are needed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for locations where the "client has stopped" error is handled.
# Find all Go files except tests
FILES=$(fd --type f --extension go --exclude '*_test.go')
# Search for direct equality comparisons with error strings
rg -F 'err == "client has stopped"' $FILES
# Search for error comparisons with ErrClientStopped
rg -F 'err == ErrClientStopped' $FILES
# Search for error handling using errors.Is with ErrClientStopped
rg -F 'errors.Is(err, ErrClientStopped)' $FILES
Length of output: 64720
Script:
#!/bin/bash
# Let's search for ErrClientStopped definition and usage patterns
# Find where ErrClientStopped is defined
rg -l "ErrClientStopped.*=.*"
# Search for error handling patterns with ErrClientStopped
rg -A 2 "if.*err.*{" | rg -B 2 "ErrClientStopped"
# Search for error returns with ErrClientStopped
rg "return.*ErrClientStopped"
# Search for error wrapping with ErrClientStopped
rg "fmt.Errorf.*ErrClientStopped"
Length of output: 322
Script:
#!/bin/bash
# Let's examine the ErrClientStopped definition and its usage context
# Show the error definition and surrounding context
rg -B 5 -A 5 "ErrClientStopped.*=.*" abci/client/socket_client.go
# Show all functions that return or use ErrClientStopped
rg -B 10 -A 2 "ErrClientStopped" abci/client/socket_client.go
# Look for any error handling patterns in test files
rg "ErrClientStopped" abci/client/*_test.go
Length of output: 1103
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
internal/consensus/state_test.go (2)
3286-3286
: Unnecessary use oft.Helper()
in the test functionThe
t.Helper()
method is intended to be used within helper functions to mark them as helpers. Calling it directly within the test functionTestStateTryAddCommitPanicsOnClientError
is unnecessary.Apply this diff to remove the unnecessary call:
- t.Helper()
3350-3352
: Verify the panic message to ensure correct behaviorCurrently, the test checks for any panic but does not verify if it's the expected one. To ensure the test is validating the correct panic scenario, consider checking the panic value.
Apply this diff to verify the panic message:
- assert.Panics(t, func() { + assert.PanicsWithValue(t, abciclient.ErrClientStopped, func() { _ = consensusState.ctrl.Dispatch(ctx, &TryAddCommitEvent{Commit: commit, PeerID: peerID}, &stateData) })
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
internal/consensus/state_test.go (3)
3278-3354
: LGTM! Well structured test for ABCI client error handlingThe test effectively verifies that the consensus state panics when encountering an ABCI client error during block processing. The test setup and assertions are clear and comprehensive.
A few suggestions for improvement:
- Consider adding cleanup of mocks in a defer statement
- The error message assertion could be extracted to a constant for reuse
- Consider adding a test case for other ABCI methods like PrepareProposal
func TestStateTryAddCommitPanicsOnClientError(t *testing.T) { + const expectedError = "ABCI client stopped, Tenderdash needs to be restarted: ProcessProposal abci method: client has stopped" ctx, cancel := context.WithCancel(context.Background()) defer cancel() + defer func() { + app.AssertExpectations(t) + }() // ... rest of the test assert.PanicsWithError(t, - "ABCI client stopped, Tenderdash needs to be restarted: ProcessProposal abci method: client has stopped", + expectedError, func() {
3294-3299
: Consider enhancing mock error handlingThe mock configuration is good but could be enhanced:
- Add verification that ProcessProposal was actually called
- Consider testing with different error types beyond ErrClientStopped
app := clientmocks.NewClient(t) +app.On("ProcessProposal", mock.Anything, mock.Anything). - Return(&abci.ResponseProcessProposal{}, abciclient.ErrClientStopped). - Once() + Run(func(args mock.Arguments) { + // Verify the arguments + req := args.Get(1).(*abci.RequestProcessProposal) + assert.NotNil(t, req) + }). + Return(&abci.ResponseProcessProposal{}, abciclient.ErrClientStopped). + Once()
3309-3323
: Consider adding more block validation checksThe block creation and commit handling is solid, but could benefit from additional validation:
- Validate block metadata matches expected values
- Add explicit checks for block parts
block, err := sf.MakeBlock(state, 1, &types.Commit{}, kvstore.ProtocolVersion) require.NoError(t, err) require.NotZero(t, block.Version.App) +require.Equal(t, state.LastBlockHeight+1, block.Height) +require.NotEmpty(t, block.LastBlockID) block.CoreChainLockedHeight = 1 commit, err := factory.MakeCommit( ctx, block.BlockID(nil), block.Height, 0, stateData.Votes.Precommits(0), state.Validators, privVals, ) require.NoError(t, err) +require.Equal(t, block.Height, commit.Height) +require.True(t, commit.BlockID.Equals(block.BlockID(nil)))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
abci/client/socket_client.go
(2 hunks)internal/consensus/state_test.go
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- abci/client/socket_client.go
Issue being fixed or feature implemented
Tenderdash does not recover after ABCI client errors and stops.
What was done?
Added proper error handling to panic on stopped client.
This causes restart of Tenderdash Docker container, reconnecting the client.
How Has This Been Tested?
Added unit test
Breaking Changes
None
Checklist:
For repository code-owners and collaborators only
Summary by CodeRabbit
Summary by CodeRabbit
New Features
ErrClientStopped
, for improved error reporting.Bug Fixes
Tests
Documentation